A bilingual multi-modal voice corpus for language and speaker recognition (LASR) services

نویسندگان

  • Steven D. Beck
  • Reva Schwartz
  • Hirotaka Nakasone
چکیده

Language and channel variations are two important concerns currently affecting practical automatic language and speaker recognition performance. To address these challenges, a corpus of speech was collected from 100 bilingual speakers in each of three foreign languages (Arabic-English, KoreanEnglish, and Spanish-English). The recordings were made in highly controlled conditions using multiple microphones simultaneously, each with different measured response characteristics. The speakers were asked to perform a set of speaking tasks including conversations, text independent readings, and prescribed text readings. These tasks were performed in English and in each speaker’s native language. The equipment, the recording procedures, and the data formats are presented, along with a preliminary analysis of recorded signal quality.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

How textbooks (and learners) get it wrong: A corpus study of modal auxiliary verbs

Many  elements  contribute  to  the  relative  difficulty  in  acquiring  specific  aspects  of  English  as  a foreign  language  (Goldschneider  &  DeKeyser,  2001).  Modal  auxiliary  verbs  (e.g.  could,  might), are  examples  of  a  structure  that  is  difficult  for  many  learners.  Not  only  are  they  particularly complex  semantically,  but  especially  in  the  Malaysian  context ...

متن کامل

UCBN: A new audio-visual broadcast news corpus for multimodal speaker verification studies

The performance of face, voice, and multimodal speaker verification systems in complex and non-controlled scenarios, is typically lower than systems developed in highly controlled environments. With the aim to facilitate the development of robust multi-modal speaker recognition systems, a new multi-modal (audio-visual) Australian broadcast UCBN (University of Canberra Broadcast News) corpus was...

متن کامل

LPFAV2: a New Multi-Modal Database for Developing Speech Recognition Systems for an Assistive Technology Application

In this paper we report on the acquisition and content of a new database intended for developing audio-visual speech recognition systems. This database supports a speaker dependent continuous speech recognition task, based on a small vocabulary, and was captured in the European Portuguese language. Along with the collected multi-modal speech materials, the respective orthographic transcription ...

متن کامل

A New Multi-modal Database for Developing Speech Recognition Systems for an Assistive Technology Application

In this paper we report on the acquisition and content of a new database intended for developing audio-visual speech recognition systems. This database supports a speaker dependent continuous speech recognition task, based on a small vocabulary, and was captured in the European Portuguese language. Along with the collected multi-modal speech materials, the respective orthographic transcription ...

متن کامل

The MMSR bilingual and crosschannel corpora for speaker recognition research and evaluation

We describe efforts to create corpora to support and evaluate systems that meet the challenge of speaker recognition in the face of both channel and language variation. In addition to addressing ongoing evaluation of speaker recognition systems, these corpora are aimed at the bilingual and crosschannel dimensions. We report on specific data collection efforts at the Linguistic Data Consortium, ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004